Validation of 58 autosomal individual identification SNPs in three Chinese populations



Faculty of Basic Medical Sciences, Chongqing Medical University, Chongqing, P.R. China

Aim To genotype and evaluate a panel of single-nucleotide polymorphisms for individual identification (IISNPs) in three Chinese populations: Chinese Han, Uyghur, and Tibetan.

Methods Two previously identified panels of IISNPs, 86 unlinked IISNPs and the SNPforID 52-plex markers, were pooled and analyzed. Four SNPs were included in both panels. In total, 132 SNPs were typed on the Sequenom MassARRAY® platform in 330 individuals from Han Chinese, Uyghur, and Tibetan populations. Population genetic indices and forensic parameters were determined for all studied markers.

Results No significant deviation from Hardy-Weinberg equilibrium was observed for any of the SNPs in the three populations. Expected heterozygosity (H_e) ranged from 0.144 to 0.500 in Han Chinese, from 0.197 to 0.500 in Uyghur, and from 0.018 to 0.500 in the Tibetan population. Wright's F_ST values ranged from 0.0001 to 0.1613. Pairwise linkage disequilibrium (LD) calculations for all 132 SNPs showed no significant LD across the populations (r² < 0.147). A subset of 58 unlinked IISNPs (r² < 0.094) with H_e > 0.450 and F_ST values from 0.0002 to 0.0536 gave match probabilities on the order of 10⁻²⁵ and a cumulative probability of exclusion of 0.999992.

Conclusion The 58 unlinked IISNPs with high heterozygosity show low allele frequency variation among the three Chinese populations, which makes them excellent candidates for the development of multiplex assays for individual identification and paternity testing.
Single-nucleotide polymorphisms (SNPs) are often used as a supplementary tool to short tandem repeat (STR) analysis (1), since they show advantages over STRs in degraded DNA detection (2), kinship analysis (3), ancestry inference (4,5), and physical trait analysis (6). Different SNP groups have been selected according to defined purposes (4,5,7,8).

In recent decades, several panels for individual identification have been developed (2,9-11). Kidd et al (12) defined an ideal SNP panel for individual identification (IISNP) as a group of statistically independent SNPs with high heterozygosity that show little frequency variation among different populations. Based on these criteria, Pakstis et al selected 86 unlinked candidate individual identification SNPs (IISNPs) with average heterozygosity >0.4 and F_ST values <0.06 for 44 major populations across the world (1). However, the sample sizes for Chinese populations were very limited. Separately, the SNPforID consortium (www.snpforid.org) developed a 52-plex SNP assay for individual identification (7). This assay was validated by Børsting et al (13-15) according to the ISO 17025 standard and used for routine casework. It worked well in several European countries but showed somewhat larger frequency variation among populations from other continents (16-18).

To assemble an ideal SNP panel for individual identification and evaluate its performance in Chinese populations, we pooled the previously described 86 IISNPs and the 52-plex SNPs and typed them in 330 samples from three Chinese population groups: Han, Tibetan, and Uyghur.

MATERIAL AND METHODS 

Sample collection and DNA preparation 

Population samples included 110 Han Chinese recruited from Beijing, 110 Uyghurs from Urumqi, and 110 Tibetans from Lhasa. Patients whose names were obtained from general practitioners' registers were invited to participate in the study. They were asked to complete a brief general information questionnaire covering name, place, sex, age, and the ethnic and racial background of the last three generations of their family. Volunteers whose ancestry information met our research requirements were recruited. In total, 330 whole blood samples of 1 mL were obtained from healthy volunteers (176 men and 154 women) who had provided written informed consent. Ethical approval was received from the Review Board, Institute of Forensic Sciences, Ministry of Public Security.



DNA was isolated and purified with a QIAamp® DNA blood midi kit (QIAGEN, Hilden, Germany). DNA was quantified by loading 1.5 µL of DNA solution onto a NanoDrop® ND-1000 spectrophotometer (Thermo Scientific, Wilmington, DE, USA).

SNP typing 

SNP genotyping was performed using the Sequenom MassARRAY® platform with the iPLEX GOLD chemistry (Sequenom, San Diego, CA, USA) following the manufacturer's protocols. Polymerase chain reaction (PCR) primers and locus-specific extension primers were designed using the MassARRAY® Assay Design software package (v. 3.1). A DNA template of 50 ng was used in each multiplexed PCR well. PCR products were treated with shrimp alkaline phosphatase (USB, Cleveland, OH, USA) before the iPLEX GOLD primer extension reaction. The single base extension products were desalted with SpectroCLEAN® resin (Sequenom), and an aliquot of 10 nL of the desalted product was then spotted onto a 384-format SpectroCHIP® with the MassARRAY® Nanodispenser. Mass determination was done with the MALDI-TOF mass spectrometer. The MassARRAY® Typer 4.0 software was employed for data acquisition.

Statistical analysis 

Population indices, including allele and genotype frequencies, expected heterozygosity (H_e), Hardy-Weinberg equilibrium (HWE), and Wright's F_ST value, were calculated for each SNP marker with Genepop v4.2. F_ST and δ values were used to measure variance in allele frequencies among populations. Random match probability (RMP), probability of exclusion (PE), and other forensic parameters were calculated with PowerStats v. 12. HaploView v. 4.2 was used to estimate linkage disequilibrium (LD) values (r²) for each pair of SNPs. HapMap genotype data were used for population comparison analyses. Individuals and markers with more than 20% missing alleles were filtered out.
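The population indices named above can be illustrated with a minimal Python sketch. It is not the Genepop computation used in the study: it assumes biallelic SNPs, equally weighted populations, and the simple textbook form of Wright's F_ST, (H_T − H_S)/H_T, in place of the estimator Genepop applies, and it takes δ to be the absolute allele-frequency difference between two populations. The function names and the allele frequencies in the usage note are hypothetical.

```python
from itertools import combinations


def expected_heterozygosity(p):
    """Expected heterozygosity of a biallelic SNP with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def wright_fst(freqs):
    """Simple Wright's F_ST = (H_T - H_S) / H_T, populations weighted equally."""
    p_bar = sum(freqs) / len(freqs)
    # H_T: heterozygosity expected in the pooled (total) population.
    h_t = expected_heterozygosity(p_bar)
    # H_S: mean heterozygosity expected within each subpopulation.
    h_s = sum(expected_heterozygosity(p) for p in freqs) / len(freqs)
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t


def pairwise_delta(freqs_by_pop):
    """Absolute allele-frequency difference for every pair of populations."""
    return {
        (a, b): abs(freqs_by_pop[a] - freqs_by_pop[b])
        for a, b in combinations(sorted(freqs_by_pop), 2)
    }
```

For example, with hypothetical frequencies `{"han": 0.50, "uyghur": 0.45, "tibetan": 0.55}`, `wright_fst` applied to the three values returns a small F_ST (below 0.01), and `pairwise_delta` returns one δ per population pair, the largest being the Tibetan-Uyghur difference.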

RESULTS 

We typed 132 SNPs in 330 individuals from Han, Uyghur, and Tibetan subpopulations (Supplementary Table 1). Four SNPs were included in both panels and 2 were excluded due to the failure of multiplex detection. HWE tests demonstrated no significant deviation from expected values (P<0.001, after Bonferroni correction for multiple testing). There were no significant LD values among the SNPs (pairwise r² < 0.147). H_e ranged from 0.144 to 0.500 in Han, from 0.197 to 0.500 in Uyghur, and from 0.018 to 0.500 in the Tibetan population. Wright's F_ST values ranged from 0.0001 to 0.1613. Allele frequencies ranged from 0.114 to 0.922 in Han, from 0.118 to 0.889 in Uyghur, and from 0.055 to 0.991 in the Tibetan population. Pairwise δ values ranged from 0 to 0.324 (Supplementary Table 2).

Among the 132 IISNPs, we identified a set of 58 unlinked markers (r² < 0.094) distributed over 22 autosomal chromosomes. This subset consisted of 41 markers from the Pakstis panel (20 recommended ones), 14 markers from the SNPforID panel, and 3 markers from both panels. H_e values were >0.450 and F_ST values ranged from 0.0002 to 0.0536. Match probabilities for the Han, Uyghur, and Tibetan populations were 5.91 × 10⁻²⁵, 7.79 × 10⁻²⁵, and 6.61 × 10⁻²⁵, respectively. The cumulative probability of exclusion for these three populations was 0.999992.
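As a rough plausibility check on figures of this magnitude, the following Python sketch combines per-locus values across a panel of independent biallelic SNPs. It is an illustration under stated assumptions, not the PowerStats computation used in the study: genotype frequencies are taken from Hardy-Weinberg proportions, loci are multiplied as if fully independent, and the per-locus power of exclusion uses a common heterozygosity-based approximation, PE = h²(1 − 2hH²), where h and H are the heterozygote and homozygote frequencies. The function names are hypothetical.

```python
import math


def genotype_match_probability(p):
    """Chance that two unrelated people share a genotype at one biallelic SNP."""
    q = 1.0 - p
    genotype_freqs = (p * p, 2.0 * p * q, q * q)  # Hardy-Weinberg proportions
    return sum(g * g for g in genotype_freqs)


def exclusion_probability(p):
    """Per-locus power of exclusion, PE = h^2 * (1 - 2 * h * H^2) (assumed form)."""
    h = 2.0 * p * (1.0 - p)  # heterozygote frequency
    hom = 1.0 - h            # homozygote frequency
    return h * h * (1.0 - 2.0 * h * hom * hom)


def panel_summary(allele_freqs):
    """Return (combined match probability, cumulative probability of exclusion)."""
    # Sum logs rather than multiplying directly to keep tiny products stable.
    log_rmp = sum(math.log10(genotype_match_probability(p)) for p in allele_freqs)
    not_excluded = math.prod(1.0 - exclusion_probability(p) for p in allele_freqs)
    return 10.0 ** log_rmp, 1.0 - not_excluded
```

Calling `panel_summary([0.5] * 58)`, that is, 58 idealized markers with allele frequency exactly 0.5, gives a combined match probability on the order of 10⁻²⁵ and a cumulative probability of exclusion of about 0.99999. Because 0.5 is the most informative frequency for a biallelic marker, these idealized values bound what a real 58-SNP panel with H_e > 0.450 can achieve, and the reported results lie close to that bound.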

DISCUSSION 

Each of the 58 unlinked markers from the two IISNP panels had high heterozygosity (H_e > 0.450) and very similar frequencies (F_ST < 0.060) across the three Chinese population samples. One hundred and thirty-two autosomal IISNPs were screened in three Chinese populations. These markers have previously been tested in many populations (1,8,16,18,19). In our study, several markers displayed wide frequency variation among the three Chinese populations, such as rs1355366, rs717302, rs1886510, and rs1493232. These markers were not suitable for individual identification. An ideal panel of IISNPs should include more than 50 statistically independent markers (low LD values), each with high heterozygosity (approaching 0.5) and low F_ST values (approaching 0). A panel meeting these criteria should have high power of discrimination and be universally applicable for individual identification. Thus, we considered the subset of 58 unlinked SNPs, which had previously been tested in a large number of worldwide populations, an excellent panel for practical use in Chinese populations. It had match probabilities on the order of 10⁻²⁵, which is comparable to the CODIS STR panel and other existing SNP typing systems (7,8,19). In forensic science, SNPs are now studied with much more interest than STRs. However, developing a SNP panel capable of individual identification is not enough for a new marker type to be widely accepted in routine testing. SNP panels for the Y chromosome (20,21) and mtDNA (22,23) should also be studied and developed to meet the requirements of paternal and maternal lineage searches.



Individual identification requires an easy-to-use multiplex SNP genotyping system that can be read with capillary electrophoresis. Furthermore, establishing a stable and reliable detection system requires testing in more population samples. In conclusion, the 58 unlinked IISNPs identified here would be an ideal panel for further multiplex system optimization and validation among Chinese populations.